relationship separated by different distances and capable of

providing distinguishable fluorescence emission patterns upon excitation
at a common wavelength. The subject labels find particular use
in a variety of multicomponent analytical applications, such as probes in FISH
and multi-array analyses, as well as primers in nucleic acid enzymic
sequencing applications.

DOCUMENT-IDENTIFIER: US 6210896 B1
TITLE: Molecular motors 



Brief Summary Text (4) :

Polymers are involved in diverse and essential functions in living systems. The 
ability to decipher the function of polymers in these systems is integral to the 
understanding of the role that the polymer plays within a cell. Often the function of 
a polymer in a living system is determined by analyzing the structure and determining 
the relation between the structure and the function of the polymer. By determining the 
primary sequence in a polymer such as a nucleic acid it is possible to generate 
expression maps, to determine what proteins are expressed, and to understand where 
mutations occur in a disease state. Because of the wealth of knowledge that may be
obtained from sequencing of polymers, many methods have been developed to achieve more
rapid and more accurate sequencing.

Brief Summary Text (10) :

The present invention relates to methods and products for linear analysis of polymers. 
In particular the invention is based on molecular motors and their use for guiding 
polymer movement during linear analysis. Recently rapid methods for analyzing polymers 
using linear analysis techniques have been developed. Such methods are described in 
co-pending PCT patent application No. PCT/US98/03024 and U.S. Ser. No. 09/134,411, the
entire contents of which are hereby incorporated by reference. The method for
analyzing polymers described in PCT/US98/03024 is based on the ability to examine each
unit of a polymer individually. By examining each unit individually the type of unit 
and the position of the unit on the backbone of the polymer can be identified. This 
can be accomplished by positioning a unit at a station and examining a change which 
occurs when that unit is proximate to the station. The change can arise as a result of 
an interaction that occurs between the unit and the station or a partner and is 
specific for the particular unit. For instance if the polymer is a nucleic acid 
molecule and a T is positioned in proximity to a station a change which is specific 
for a T could occur. If on the other hand, a G is positioned in proximity to a station 
then a change which is specific for a G could occur. The specific change which occurs, 
for example, depends on the station used, the type of polymer being studied, and/or 
the label used. For instance the change may be an electromagnetic signal which arises 
as a result of the interaction. 

Brief Summary Text (15) :

The polymer may be any type of polymer of linked units. The type of molecular motor 
which can be used, however, will depend on the type of polymer. In one embodiment the 
polymer is a nucleic acid and the molecular motor is a polymerase. In another 
embodiment the polymer is a peptide and the molecular motor is a myosin. 

Brief Summary Text (19) :

The molecular motor tethered to the support may be any type of molecular motor.
Preferably the molecular motor is a nucleic acid molecular motor or a peptide
molecular motor selected from the group consisting of polymerase, helicase, kinesin, 
dynein, actin, and myosin. 

Brief Summary Text (20) :

According to another aspect of the invention a molecular motor is provided. The 
molecular motor includes an agent positioned in interactive proximity with a signal 
station of the molecular motor, wherein the agent is selected from the group 
consisting of an electromagnetic radiation source, a quenching source, and a 
fluorescence excitation source. In one embodiment the molecular motor is in a 
solution. In another embodiment, the solution includes only a single molecular motor. 
Preferably the molecular motor is a nucleic acid molecular motor. 






Brief Summary Text (23) : 

The molecular motor can be a nucleic acid molecular motor or a peptide molecular
motor. One type of nucleic acid molecular motor is a polymerase. 

Brief Summary Text (30) :

The plurality of polymers may be any type of polymer but preferably is a nucleic acid.
In one embodiment the plurality of polymers is a homogenous population. In another 
embodiment the plurality of polymers is a heterogenous population. The polymers can be 
labeled, randomly or non randomly. Different labels can be used to label different 
linked units to produce different polymer dependent impulses. 

Brief Summary Text (34) :

In one embodiment the polymer dependent impulses are optically detectable. In another
embodiment the nucleic acids are labeled with an agent selected from the group
consisting of an electromagnetic radiation source, a quenching source, a fluorescence 
excitation source, and a radiation source. 

Brief Summary Text (35) :

The plurality of polymers may be any type of polymer but preferably is a nucleic acid.
In one embodiment the plurality of polymers is a homogenous population. In another 
embodiment the plurality of polymers is a heterogenous population. The polymers can be 
labeled, randomly or non randomly. Different labels can be used to label different 
linked units to produce different polymer dependent impulses. 

Brief Summary Text (37) : 

The plurality of polymers may be any type of polymer but preferably is a nucleic acid.
In one embodiment the plurality of polymers is a homogenous population. In another 
embodiment the plurality of polymers is a heterogenous population. The polymers can be 
labeled, randomly or non randomly. Different labels can be used to label different 
linked units to produce different polymer dependent impulses. 



Drawing Description Text (8) : 

SEQ. ID. NO. 1 is a hypothetical nucleic acid sequence.

Drawing Description Text (9) :

SEQ. ID. NO. 2 is a hypothetical nucleic acid sequence.

Drawing Description Text (10) :

SEQ. ID. NO. 3 is a hypothetical nucleic acid sequence.

Drawing Description Text (11) :

SEQ. ID. NO. 4 is a hypothetical nucleic acid sequence.



Detailed Description Text (6) :

The unit and the agent are moved relative to one another by a molecular motor. A 
"molecular motor" as used herein is a biological molecule which physically interacts 
with a polymer and moves the polymer past a signal station. Preferably the molecular 
motor is a molecule such as a protein or protein complex that interacts with a polymer
and moves with respect to the polymer along the length of the polymer. The molecular 
motor interacts with each unit of the polymer in a sequential manner. The physical 
interaction between the molecular motor and the polymer is based on molecular forces 
occurring between molecules such as, for instance, van der Waals forces. The type of
molecular motor useful according to the methods of the invention depends on the type
of polymer being analyzed. For instance, a molecular motor such as a DNA
polymerase or a helicase is useful when the polymer is DNA, a molecular motor such as
RNA polymerase is useful when the polymer is RNA, and a molecular motor such as myosin 
is useful for example when the polymer is a peptide such as actin. Molecular motors 
include, but are not limited to, helicases, RNA polymerases, DNA polymerases, kinesin, 
dynein, actin, and myosin. Those of ordinary skill in the art would easily be able to 
identify other molecular motors useful according to the invention, based on the 
parameters described herein. 

Detailed Description Text (9) : 

Another preferred type of molecular motor is a helicase. Helicases have previously
been described, e.g., see U.S. Pat. No. 5,888,792. Helicases are proteins which move
along nucleic acid backbones and unwind the nucleic acid so that the processes of DNA
replication, repair, recombination, transcription, mRNA splicing, translation and
ribosomal assembly can take place. Helicases include both RNA and DNA helicases.

Detailed Description Text (12) :

The molecular motors of the invention fall into two categories, nucleic acid molecular
motors and protein molecular motors. Nucleic acid molecular motors include those
molecular motors that move along the backbone of a nucleic acid molecule and include,
for instance, polymerases and helicases. The protein molecular motors move along the
backbone of a protein or peptide, and include for instance kinesin, dynein, actin and
myosin. In some embodiments the molecular motor is preferably a nucleic acid molecular
motor and in other embodiments it is preferably a protein molecular motor.

Detailed Description Text (14) :

The method of the invention is described with respect to the following non-limiting 
example, which is provided for illustrative purposes only. The example refers to the 
analysis of DNA and fluorescence, but those of ordinary skill in the art would 
understand that it is applicable to all polymers and all claimed detection systems. In 
the example, a DNA polymerase is labeled with several fluorescent molecules, e.g. 
donor fluorescent molecules. A DNA molecule labeled with a matching fluorophore, e.g. 
an acceptor fluorophore, is then used as a template for the DNA polymerase which
begins to undergo primer extension. As the acceptor fluorophore moves past the donor
fluorophore, fluorescence resonance energy transfer (FRET) occurs. FRET occurs when 
the donor and acceptor fluorophores undergo a close range interaction in the range of 
approximately 1 angstrom to 100 angstroms. This distance is achieved when a single 
nucleotide with a label passes the fluorophore on the polymerase. 

Detailed Description Text (15) : 

FRET analysis using molecular motors can be performed on single molecules in solution
or as parallel reactions on a solid planar medium. It may also be performed in
parallel reactions in different solutions such as in multi-well dishes. In the
embodiment in which the reaction is carried out on a planar solid medium, either the
labeled polymer or the labeled molecular motor may be immobilized directly or through
a linker onto the surface. If the polymer is attached to the surface, then the molecular
motor can be added subsequently, and if the molecular motor is tethered to the surface,
then the polymer may be added to initiate the reaction. In this manner, simultaneous
linear reading of multiple donor-acceptor reaction sites can occur to enhance the
throughput of the system. When the molecular motor is a DNA polymerase, the sequence
of several kilobases of DNA can be obtained rapidly. The approximate rate of
sequencing can approach 1 megabase/hour with a one-camera system.

Detailed Description Text (16) :

The preparation of fluorescently labeled enzyme and protein complexes, which can serve
as molecular motors, is well known in the art. The availability of multiple amine,
carboxyl, and sulfhydryl sites on enzymes makes conjugation of labels to these
molecules straightforward. Many proteins have been functionalized to produce
fluorescent derivatives without loss of activity, including, for instance, antibodies,
horseradish peroxidase, glucose oxidase, β-galactosidase, alkaline phosphatase, actin,
and myosin. Molecular motors can be easily derivatized in a similar manner, without
losing functional activity. Additionally, labels can be incorporated into the polymer
using methods known in the art, such as those described in U.S. Ser. No. 09/134,411.
For instance, the label can be incorporated into the polymer using commercially
available nucleotide or amino acid polymers or as succinimidyl ester derivatives which
can be linked to primary amino groups.

Detailed Description Text (19) :

In a preferred embodiment the fluorescent dye and its energy transfer pair are
carefully selected to maximize signal production. This can be accomplished by
considering the parameters described by the formula set forth below. Fluorescence
energy transfer (FRET) is directly related to the spectral overlap of the donor
fluorescence emission and the acceptor fluorescence absorbance. In the formula, J is the
normalized spectral overlap of the donor emission (f_D) and the acceptor absorption
(ε_A), q_0 is the quantum efficiency (or quantum yield) for donor emission
in the absence of acceptor (q_0 is the number of photons emitted divided by the number of
photons absorbed), n is the index of refraction (typically 1.3-1.4), and
κ^2 is a geometric factor related to the relative angle of the two
transition dipoles. The equation which summarizes the importance of the normalized
spectral overlap is given as:

Detailed Description Text (20) :

The J factor is especially important in the determination of the Förster energy
transfer distance, which is the distance at which energy transfer from donor
fluorophore to acceptor fluorophore is 50%. The Förster distance also determines the
resolution of the FRET sequencing method. In general the Förster distance can be
varied between as little as 5 angstroms and 100 angstroms.
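
The relation between the Förster distance and transfer efficiency described above can be sketched numerically. The following is a minimal illustration, not the formula of this document (which is not reproduced in this record). It assumes the commonly cited textbook form in which the Förster distance in angstroms is 0.211 (κ^2 n^-4 q_0 J)^(1/6) with J in units of M^-1 cm^-1 nm^4, and the usual efficiency expression E = 1 / (1 + (r/R0)^6). The function names, the 0.211 prefactor, and the units of J are assumptions introduced here.

```python
def forster_distance(kappa2, n, q0, J):
    """Förster distance R0 in angstroms.

    Assumed textbook form: R0 = 0.211 * (kappa^2 * n^-4 * q0 * J)^(1/6),
    with J in M^-1 cm^-1 nm^4. The prefactor and units are assumptions,
    not values stated in the source text.
    """
    return 0.211 * (kappa2 * n ** -4 * q0 * J) ** (1.0 / 6.0)


def transfer_efficiency(r, r0):
    """Fraction of donor excitations transferred at donor-acceptor distance r.

    By construction the efficiency is exactly 0.5 when r equals r0,
    matching the 50% definition of the Förster distance.
    """
    return 1.0 / (1.0 + (r / r0) ** 6)


if __name__ == "__main__":
    # Illustrative (hypothetical) parameter values only.
    r0 = forster_distance(kappa2=2.0 / 3.0, n=1.4, q0=0.9, J=1e15)
    for r in (0.5 * r0, r0, 2.0 * r0):
        print(f"r = {r:6.1f} A  ->  E = {transfer_efficiency(r, r0):.3f}")
```

Because the efficiency falls off with the sixth power of distance, the signal changes sharply as a labeled unit passes the Förster distance, which is why that distance bounds the resolution of the method.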

Detailed Description Text (21) :

We have considered these variables in our choice of the optimal donor-acceptor pair
for use in our FRET sequencing system. The J factor is important, but there are 
additional factors which should be worked into the system for optimal performance such 
as 1) the sharpness of the spectral bands, 2) the lack of crosstalk between the 
spectral bands, 3) the ability to immobilize the chosen labels in a polymeric matrix, 
and 4) the ability to have a match with common labels used for incorporation into DNA. 



Detailed Description Text (22) :

Other factors can be considered in choosing the proper fluorescent label pair. For
instance, the spectral overlap of the labels should be sufficient for energy transfer.
By minimizing direct excitation of the acceptor fluorophore, crosstalk in excitation
levels can be avoided. Additionally, the emission of the donor fluorophore should not
interfere with the detection band from the acceptor fluorophore. In this manner, the
measured fluorescent events will be suitable and indicative of the occurrence of
energy transfer. Under ideal conditions, the donor and acceptor fluorescence is sharp
and not subject to spectral broadening. Furthermore, there are considerations in the
quantum yield, photostability, and cross-sectional areas of the labels. All of these
parameters can easily be manipulated by one of skill in the art based on the known
properties of known and commercially available labels.

Detailed Description Text (23) :

Those of ordinary skill in the art can verify the extent of fluorescent labeling of
the molecular motor and/or polymer. The level of fluorescence labeling in the
fluorophore conjugated molecule is determined by either the absorbance or the
fluorescence emission of the sample. The number of fluorophore molecules per molecule
is called the F/M ratio. This value is measured for all preparations of
enzyme-fluorophore complexes. The ideal F/M ratio is determined for the particular
molecule (molecular motor or polymer)-fluorophore combination. Using the
known extinction coefficient of the fluorophore, a determination of the derivatization
level can be made after excess of the fluorophore is removed.
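
One common way to estimate an F/M ratio from absorbance readings is sketched below. It is a hedged illustration rather than the procedure of this document: it assumes the Beer-Lambert law, a protein concentration estimated from absorbance at 280 nm, and an optional correction factor for fluorophore absorbance at 280 nm. The function name, the parameter names, and all numeric values are hypothetical.

```python
def f_to_m_ratio(a_dye, eps_dye, a_280, eps_protein, cf_280=0.0, path_cm=1.0):
    """Estimate fluorophores per molecule (F/M) from absorbance readings.

    a_dye       absorbance at the fluorophore's absorption maximum
    eps_dye     fluorophore molar extinction coefficient (M^-1 cm^-1)
    a_280       absorbance of the conjugate at 280 nm
    eps_protein protein molar extinction coefficient at 280 nm (M^-1 cm^-1)
    cf_280      assumed fraction of a_dye that the fluorophore contributes at 280 nm
    path_cm     optical path length in cm

    Assumes free fluorophore has already been removed from the sample.
    """
    dye_molar = a_dye / (eps_dye * path_cm)
    protein_molar = (a_280 - cf_280 * a_dye) / (eps_protein * path_cm)
    if protein_molar <= 0:
        raise ValueError("corrected A280 must be positive")
    return dye_molar / protein_molar


if __name__ == "__main__":
    # Hypothetical readings and coefficients, for illustration only.
    ratio = f_to_m_ratio(a_dye=0.30, eps_dye=68000.0,
                         a_280=0.50, eps_protein=100000.0, cf_280=0.30)
    print(f"F/M = {ratio:.2f}")
```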

Detailed Description Text (24) :

The activity of the labeled molecular motors can be verified using standard assays
which assess the viability of the molecular motor fluorophore complex after
conjugation and purification. Various molecular motors have their own assays for
activity verification. DNA polymerase and its activity after conjugation to FITC are
discussed below to further clarify this subject. This example is in no way limiting
of the scope of the invention.

Detailed Description Text (25) :

DNA polymerase-fluorophore complexes are checked in dideoxy sequencing reactions to
verify the ability of the modified molecular motor to perform its chain extension
function. Primer annealing, labeling, and termination reactions are executed to
determine the length of single-stranded, dideoxy terminated products and also to assay
the base accuracy of the extended products. The reaction mixtures for the four
dideoxynucleotides are subjected to four color automated capillary gel electrophoresis
(such as the ABI 3770) for the final analysis. Match of the sequences with the known
M13 ssDNA sequencing template confirms the integrity of the polymerase-fluorophore
complexes.

Detailed Description Text (26) :

FIG. 1 depicts an array of molecular motors (i.e. DNA polymerases) bound to the
surface of a glass slide. The polymerases are labeled with donor fluorescent molecules
which have emission spectra which partially overlap the excitation spectra of the
acceptor molecule. Template acceptor labeled polymer (i.e. DNA) is provided in the
reaction mixture along with the appropriate extension primers. The reaction is
initiated with a mixture of deoxynucleotides. The chain extension allows the acceptor
on the template DNA to be moved in proximity to the donors on the polymerase. Once the
acceptor comes within energy transfer proximity to the donor on the immobilized
polymerase molecule, non-radiative energy transfer occurs. Sensitized fluorescence emission
from the acceptor is induced. The temporally spaced fluorescence emission from the
substrates allows for interrogation of the nucleotide information about the template
molecule.

Detailed Description Text (28) :

Another example of the linear analysis method of the invention is depicted in FIG. 2.
In the example the template may be fixed to the glass surface and the polymerase
mobile in solution. As shown in FIG. 2, the donor fluorescence molecule may be located
on the DNA molecule as opposed to the acceptor. The series of interactions may be
mediated by a different molecular motor such as a helicase molecule which unwinds
duplex DNA. In this scenario, the helicase molecule is fluorescently tagged and
allowed to unwind complexes which are asymmetrically labeled with the fluorescent
molecules. The asymmetric labeling allows for the ease of deciphering the information
about the polymer.

Detailed Description Text (32) :

The methods of the invention also are useful for identifying other structural 
properties of polymers. The structural information obtained by analyzing a polymer 
according to the methods of the invention may include the identification of 
characteristic properties of the polymer which (in turn) allows, for example, for the 
identification of the presence of a polymer in a sample or a determination of the 
relatedness of polymers, identification of the size of the polymer, identification of 
the proximity or distance between two or more individual units of a polymer, 
identification of the order of two or more individual units within a polymer, and/or 
identification of the general composition of the units of the polymer. Such 
characteristics are useful for a variety of purposes such as determining the presence 
or absence of a particular polymer in a sample. For instance, when the polymer is a
nucleic acid the methods of the invention may be used to determine whether a
particular genetic sequence is expressed in a cell or tissue. The presence or absence
of a particular sequence can be established by determining whether any polymers within
the sample express a characteristic pattern of individual units which is only found in
the polymer of interest, i.e., by comparing the detected signals to a known pattern of
signals characteristic of a known polymer to determine the relatedness of the polymer
being analyzed to the known polymer. The entire sequence of the polymer of interest 
does not need to be determined in order to establish the presence or absence of the 
polymer in the sample. Similarly the methods may be useful for comparing the signals 
detected from one polymer to a pattern of signals from another polymer to determine 
the relatedness of the two polymers. 

Detailed Description Text (38) : 

As used herein "similar polymers" are polymers which have at least one overlapping
region. Similar polymers may be a homogeneous population of polymers or a heterogenous
population of polymers. A "homogeneous population" of polymers as used herein is a
group of identical polymers. A "heterogenous population" of similar polymers is a
group of similar polymers which are not identical but which include at least one
overlapping region of identical units. An overlapping region in a nucleic acid
typically consists of at least 10 contiguous nucleotides. In some cases an overlapping
region consists of at least 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, or 22
contiguous nucleotides.

Detailed Description Text (39) :

A "polymer" as used herein is a compound having a linear backbone of individual units
which are linked together by linkages. In some cases the backbone of the polymer may
be branched. Preferably the backbone is unbranched. The term "backbone" is given its
usual meaning in the field of polymer chemistry. The polymers may be heterogeneous in
backbone composition thereby containing any possible combination of polymer units
linked together such as peptide-nucleic acids (which have amino acids linked to
nucleic acids and have enhanced stability). In a preferred embodiment the polymers are
homogeneous in backbone composition and are, for example, nucleic acids, polypeptides,
polysaccharides, carbohydrates, polyurethanes, polycarbonates, polyureas,
polyethyleneimines, polyarylene sulfides, polysiloxanes, polyimides, polyacetates,
polyamides, polyesters, or polythioesters. In the most preferred embodiments, the
polymer is a nucleic acid or a polypeptide. A "nucleic acid" as used herein is a
biopolymer comprised of nucleotides, such as deoxyribose nucleic acid (DNA) or ribose
nucleic acid (RNA). A polypeptide as used herein is a biopolymer comprised of linked
amino acids.

Detailed Description Text (42) :

Whenever a nucleic acid is represented by a sequence of letters it will be understood
that the nucleotides are in 5'→3' order from left to right and that "A" denotes
adenosine, "C" denotes cytidine, "G" denotes guanosine, "T" denotes thymidine, and "U"
denotes uracil unless otherwise noted.

Detailed Description Text (45) : 

Many naturally occurring units of a polymer are light emitting compounds or quenchers.
For instance, nucleotides of native nucleic acid molecules have distinct absorption
spectra, e.g., A, G, T, C, and U have absorption maximums at 259 nm, 252 nm, 267 nm,
271 nm, and 258 nm respectively. Modified units which include intrinsic labels may
also be incorporated into polymers. A nucleic acid molecule may include, for example,
any of the following modified nucleotide units which have the characteristic energy
emission patterns of a light emitting compound or a quenching compound:
2,4-dithiouracil, 2,4-diselenouracil, hypoxanthine, mercaptopurine, 2-aminopurine, and
selenopurine.

Detailed Description Text (51) :

As used herein the "relatedness of polymers" can be determined by identifying a
characteristic pattern of a polymer which is unique to that polymer. For instance, if
the polymer is a nucleic acid then virtually any sequence of 10 contiguous nucleotides
within the polymer would be a unique characteristic of that nucleic acid molecule. Any
other nucleic acid molecule which displayed an identical sequence of 10 nucleotides
would be a related polymer.
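
The relatedness criterion above (a shared run of 10 contiguous nucleotides) can be illustrated with a short sketch. This is a minimal example under the assumption that both sequences are already available as strings; the function names are hypothetical and the approach (set intersection of all length-k windows) is one simple way to test the criterion, not a method described in this document.

```python
def shared_kmers(seq_a, seq_b, k=10):
    """Return the set of length-k substrings present in both sequences."""
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return kmers_a & kmers_b


def are_related(seq_a, seq_b, k=10):
    """True if the two sequences share at least one identical run of k units."""
    return bool(shared_kmers(seq_a, seq_b, k))


if __name__ == "__main__":
    a = "TTTT" + "ACGTACGTAC" + "GGGG"
    b = "CC" + "ACGTACGTAC" + "AA"
    print(are_related(a, b))            # the two share a 10-nucleotide run
    print(are_related("A" * 30, "C" * 30))
```

Raising k (for example to 11 through 22, as in the definition of an overlapping region) makes the test stricter without changing the logic.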

Detailed Description Text (67) : 

The information contained in the data and how it is analyzed depends on the number and
type of labeled units that were caused to interact with the agent to generate signals.
For instance, if every unit of a single polymer is labeled, with each type of unit
(e.g., all the A's of a nucleic acid) having a specific type of label, then
it will be possible to determine from analysis of a single polymer the order of every
labeled unit within the polymer. If, however, only one of the four types of units of a
nucleic acid is labeled then more data will be required to determine the complete
sequence of the nucleic acid. Additionally, the method of data analysis will vary
depending on whether the nucleic acid is single stranded or double stranded or
otherwise complexed. Several labeling schemes and methods for analysis using the
computer system data produced by those schemes are described in more detail below.

Detailed Description Text (69) :

A four nucleotide labeling scheme can be created where the A's, C's, G's, and T's of a
target DNA are labeled with different labels. Such a molecule, if moved linearly past a
station, will generate a linear order of signals which correspond to the linear
sequence of nucleotides on the target DNA. The advantage of using a four nucleotide
strategy is its ease of data interpretation and the fact that the entire sequence of
labeled units can be determined from a single labeled nucleic acid. Adding extrinsic
labels to all four bases, however, may cause steric hindrance problems. In order to
reduce this problem the intrinsic properties of some or all of the nucleotides may be
used to label the nucleotides. As discussed above, nucleotides are intrinsically
labeled because each of the purines and pyrimidines has distinct absorption spectra
properties. In each of the labeling schemes described herein the nucleotides may be
either extrinsically or intrinsically labeled but it is preferred that at least some
of the nucleotides are intrinsically labeled when the four nucleotide labeling method
is used. It is also preferred that when extrinsic labels are used with the four
nucleotide labeling scheme that the labels be small and neutral in charge to reduce
steric hindrance.

Detailed Description Text (70) :

A three nucleotide labeling scheme in which three of the four nucleotides are labeled 
may also be performed. When only three of the four nucleotides are labeled analysis of 
the data generated by the methods of the invention is more complicated than when all 
four nucleotides are labeled. The data is more complicated because the number and 
position of the nucleotides of the fourth unlabeled type must be determined 
separately. One method for determining the number and position of the fourth 
nucleotide utilizes analysis of two different sets of labeled nucleic acid molecules. 
For instance, one nucleic acid molecule may be labeled with A, C, and G, and another
with G, and T. Analysis of the linear order of labeled nucleotides from the two 
sets yields sequence data. The three nucleotides chosen for each set can have many 
different possibilities as long as the two sets contain all four labeled nucleotides. 
For example, the set ACG can be paired with a set of labeled CGT, ACT or AGT.

Detailed Description Text (71) :

The sequence including the fourth nucleotide also may be determined by using only a
single labeled nucleic acid rather than a set of at least two differently labeled
nucleic acids using a negative labeling strategy to identify the position of the
fourth nucleotide on the nucleic acid. Negative labeling involves the identification
of sequence information based on units which are not labeled. For instance, when three 
of the nucleotides of a nucleic acid molecule are labeled with a label which provides 
a single type of signal, the points along the nucleic acid backbone which are not 
labeled must be due to the fourth nucleotide. This can be accomplished by determining 
the distance between labeled nucleotides on a nucleic acid molecule. For example, suppose A, C,
and G are labeled and the detectable signals generated indicate that the nucleic acid
molecule has a sequence of AGGCAAACG (SEQ. ID. No. 1). If the distances between each
of the nucleotides in the nucleic acid molecule are equivalent to the known 
inter-nucleotide distance for a particular combination of nucleotides except the 
distance between G and G is twice the normal inter-nucleotide distance then a T is 
positioned between the two G's and the entire molecule has a sequence of AGTGCAAACG
(SEQ. ID. No. 2). The distance between nucleotides can be determined in several ways.
Firstly, the nucleic acid and the station may be moved relative to one another in a 
linear manner and at a constant rate of speed such that a single labeled unit of the 
nucleic acid molecule will pass the station at a single time interval. If two time 
intervals elapse between detectable signals then the unlabeled nucleotide which is not 
capable of producing a detectable signal is present within that position. This method 
of determining the distance between labeled units is discussed in more detail below in 
reference to random one nucleotide labeling. Alternatively the nucleic acid and the 
station may be caused to interact with one another such that each labeled unit 
interacts simultaneously with a station to produce simultaneous detectable signals. 
Each detectable signal generated occurs at the point along the nucleic acid where the 
labeled unit is positioned. The distance between the detectable signals can be 
calculated directly to determine whether an unlabeled unit is positioned
anywhere along the nucleic acid molecule.
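
The negative labeling idea in the paragraph above can be sketched as follows. This is a minimal illustration that assumes the constant-rate case, in which each unit passes the station in one fixed time interval and every missing interval between two signals corresponds to one unlabeled unit. The function name, the (time, base) input format, and the timing values are hypothetical.

```python
def fill_unlabeled(signals, interval, unlabeled="T"):
    """Reconstruct a sequence from labeled-unit signals by negative labeling.

    signals   list of (time, base) pairs for the labeled units, sorted by time
    interval  time for one unit to pass the station (assumed constant)
    unlabeled the single unit type that carries no label
    """
    if not signals:
        return ""
    sequence = [signals[0][1]]
    for (t_prev, _), (t_cur, base) in zip(signals, signals[1:]):
        # Number of unit positions separating the two detected signals.
        steps = round((t_cur - t_prev) / interval)
        # Every position without a signal is attributed to the unlabeled unit.
        sequence.extend(unlabeled * (steps - 1))
        sequence.append(base)
    return "".join(sequence)


if __name__ == "__main__":
    # Signals for A, C, and G only; two intervals elapse between the two G's.
    times = [0, 1, 3, 4, 5, 6, 7, 8, 9]
    bases = "AGGCAAACG"
    print(fill_unlabeled(list(zip(times, bases)), interval=1))
```

With these inputs the labeled read AGGCAAACG is expanded to AGTGCAAACG, mirroring the example in the text. Unlabeled units at either end of the molecule are not recovered by this sketch, because no signal brackets them.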

Detailed Description Text (72) :

Nucleic acid molecules may also be labeled according to a two nucleotide labeling
scheme. Six sets of two nucleotide labeled nucleic acid molecules can be used to
resolve the data and interpret the nucleotide sequence. Ambrose et al., 1993 and
Harding and Keller, 1992 have demonstrated the synthesis of large fluorescent DNA
molecules with two of the nucleotides completely extrinsically labeled. The average
size of the molecules was 7 kilobases. Six different combinations of two nucleotide
labeling are possible using the following formula: C(n, k) = n! / (k! (n - k)!)

D etailed Description Tex t (73) : 

where n nucleotides are taken k at a time. The possible combinations are AC, AG, AT,
CG, CT, and GT. Knowledge of the linear order of the labels in each of the sets allows
for successful reconstruction of the nucleic acid sequence. Using a 4-mer (5'ACGT3')
as a model sequence, the theory can be demonstrated. The first set, AC, gives the
information that there must be a C after the A. This does not give information about
the number of nucleotides intervening between the A and the C, nor does it give
information about any G's or T's preceding the A. The second set, AG, shows that there
is also a G after the A. Set AT shows there is a T after the A. From these three sets,
it is then known that the target DNA is a 4-mer and that one C, one G, and one T follow
the A. The subsequent sets give information on the ordering of these three nucleotides
following the A. Set CG shows that G follows C. Set CT shows that T follows C. Set GT
finishes the arrangement to give the final deciphered sequence of 5'ACGT3' (SEQ ID NO.
4). In addition to the method using six labeled sets of nucleic acid molecules, the
sequence can be established by combining information about the distance between labeled
nucleotides generating detectable signals as described above and information obtained
from fewer than six sets of two nucleotide labeled nucleic acid molecules.
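
The 4-mer example can be sketched in a few lines of Python. This is an illustrative
sketch only, not a procedure given in the specification: the function names are
hypothetical, and the reconstruction assumes that each nucleotide occurs exactly once
in the target, as in the 5'ACGT3' model sequence.

```python
from itertools import combinations
from math import comb

BASES = "ACGT"

def two_nucleotide_sets(seq):
    """For each of the six base pairs, keep only those two bases, in 5'->3' order."""
    return {a + b: "".join(ch for ch in seq if ch in (a, b))
            for a, b in combinations(BASES, 2)}

def reconstruct_unique(label_orders):
    """Rebuild a sequence in which every base occurs once (assumption).

    Each set says which of its two bases comes first; a base preceded by
    k other bases must sit at position k.
    """
    preceded_by = {}
    for order in label_orders.values():
        for base in order:
            preceded_by.setdefault(base, 0)
        if len(order) == 2:
            preceded_by[order[1]] += 1
    return "".join(sorted(preceded_by, key=preceded_by.get))

sets = two_nucleotide_sets("ACGT")
print(len(sets), comb(4, 2))        # number of sets vs. n-choose-k with n=4, k=2
print(reconstruct_unique(sets))
```

For sequences with repeated nucleotides the order information alone is not sufficient,
which is where the distance information mentioned above would come in.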

Detailed Description Text (79) :

In the population method the entire population of labeled nucleotides is considered.
Knowledge of the length of the localized region of the agent and instantaneous rate,
as required for the rate method, is not necessary. Use of population analyses
statistically eliminates the need for precision measurements on individual nucleic
acid molecules.

Detailed Description Text (80) :

An example of population analyses using five nucleic acid molecules each traversing a
nanochannel is described below. Five molecules representing a population of identical
DNA fragments are prepared. In a constant electric field, the time of detection
between the first and second labeled nucleotides should be identical for all the DNA
molecules. Under experimental conditions, these times differ slightly, leading to a
Gaussian distribution of times. The peak of the Gaussian distribution is
characteristic of the distance of separation (d) between two labeled nucleotides.
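
As a rough numerical illustration of the five-molecule example, the sketch below
estimates the peak of the time distribution as the sample mean and converts it to a
separation. The transit times and the constant velocity are invented values, and the
conversion d = velocity x time is an assumption made for the sketch, not a relation
stated in the specification.

```python
from statistics import fmean, stdev

def separation_from_times(transit_times_s, velocity_nm_per_s):
    """Estimate the label separation d from a population of transit times.

    The mean is used as the estimate of the Gaussian peak; the standard
    deviation reports the experimental spread around it.
    """
    peak_time = fmean(transit_times_s)
    spread = stdev(transit_times_s)
    return velocity_nm_per_s * peak_time, spread

# Five molecules, hypothetical times (s) between first and second label
times = [1.00, 1.10, 0.90, 1.05, 0.95]
d, spread = separation_from_times(times, velocity_nm_per_s=10.0)
print(d, spread)
```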

Detailed Description Text (81) :

An additional example utilizing a population of one nucleotide randomly labeled
nucleic acid molecules (six molecules represent the population) further illustrates the
concept of population analysis and the determination of distance information. The
nucleic acid is end-labeled to provide a reference point. With enough nucleic acid
molecules, the distance between any two A's can be determined. Two molecules, when
considered as a sub-population, convey the nucleotide separations; from these
molecules, distributions of 4 and 6 nucleotide separations are created. Extending the
same logic to the rest of the population, the positions of all the A's on the DNA can
be determined. The entire sequence is generated by repeating the process for the other
three bases (C, G, and T).

Detailed Description Text (82) : 

In addition to labeling all of one type of labeled unit in the above-described
examples, it is possible to use various labeling schemes where not every nucleotide of
the nucleotides or markers to be labeled is labeled, such as a one nucleotide labeling
scheme where fewer than all of the one nucleotide are labeled. A representative
population of random A-labeled fragments for a 16-mer with the sequence
5'ACGTACGTACGTACGT3' (SEQ. ID. No. 3) is used. Each individually labeled DNA molecule
has half of its A's labeled in addition to 5' and 3' end labels. With a large
population of randomly labeled fragments, the distance between every successive A on
the target DNA can be found. The end labels serve to identify the distance between the
ends of the DNA and the first A. Repeating the same analysis for the other nucleotides
generates the sequence of the 16-mer by compiling the data to identify the position of
all of the A's within that population of nucleic acid molecules. These steps can then
be repeated using labeled units for the other nucleotides in the population of nucleic
acids. The advantages of using such a method include lack of steric effects and ease
of labeling. This type of labeling is referred to as random labeling. A nucleic acid
which is "randomly labeled" is one in which fewer than all of a particular type of
labeled unit are labeled. It is unknown which labeled units of a particular type of a
randomly labeled nucleic acid are labeled.
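
The 16-mer example can be simulated as follows. This is a simplified sketch under
stated assumptions: the "measurement" is modeled as the exact offset of each labeled
nucleotide from the 5' end label, the length is taken as known from the two end
labels, and all function names and the population size of 200 are hypothetical.

```python
import random

def random_label(seq, base, fraction, rng):
    """Label a random subset (here half) of the positions holding `base`."""
    positions = [i for i, ch in enumerate(seq) if ch == base]
    k = max(1, round(len(positions) * fraction))
    return sorted(rng.sample(positions, k))

def positions_from_population(seq, base, n_molecules, fraction=0.5, seed=0):
    """Pool the offsets from the 5' end label observed over many molecules."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(n_molecules):
        seen.update(random_label(seq, base, fraction, rng))
    return sorted(seen)

def reconstruct(seq, n_molecules=200):
    """Repeat the one-nucleotide analysis for A, C, G and T and merge the results."""
    out = ["?"] * len(seq)
    for base in "ACGT":
        for p in positions_from_population(seq, base, n_molecules):
            out[p] = base
    return "".join(out)

target = "ACGTACGTACGTACGT"
print(positions_from_population(target, "A", 200))
print(reconstruct(target))
```

No single molecule carries every A, but with a large enough population every A
position is observed on at least one molecule.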

Detailed Description Text (83) :

A similar type of analysis may be performed by labeling each of the four nucleotides
incompletely but simultaneously within a population. For instance, each of the four
nucleotides may be partially labeled with its own labeled unit which gives rise to a
different physical characteristic, such as color, size, etc. This can be accomplished
to generate a data set containing information about all of the nucleotides from a
single population analysis. For instance the method may be accomplished by partially
labeling two nucleotide pairs at one time. Two nucleotide labeling is possible through
the lowering of steric hindrance effects by using labeled units which recognize the
two nucleotides of a nucleic acid strand and which contain a label such as a single
fluorescent molecule. Ambrose et al., 1993 and Harding and Keller, 1992 have
demonstrated that large fluorescent nucleic acid molecules with two of the nucleotides
completely labeled are possible to achieve. The average size of the molecules studied
was 7 kb. Partial labeling of two and three nucleotides is possible. For instance,
each of three nucleotides is partially labeled with a different labeled unit. In this
case, a population of single stranded nucleic acid molecules which are partially
labeled with three specific nucleotide pair combinations is generated and can be
analyzed.

Detailed Description Text (84) : 

The methods of the invention can also be achieved using a double stranded nucleic
acid. In a double stranded nucleic acid, when a single nucleotide on each of the two
strands is labeled, information about two nucleotides becomes available for each of
the strands. For instance, in the random and partial labeling of A's, knowledge about
the A's and T's becomes available. For example, a labeling strategy can be used in
which two differently labeled nucleic acid samples are prepared. The first sample has
two non-complementary nucleotides randomly labeled with the same fluorophore.
Non-complementary pairs of nucleotides are AC, AG, TC, and TG. The second sample has
one of its nucleotides randomly labeled. The nucleotide chosen for the second sample
may be any one of the four nucleotides. In the example provided, the two
non-complementary nucleotides are chosen to be A and C, and the single nucleotide is
chosen to be A. Two samples are prepared, one with labeled A's and C's and another
with labeled A's. The nucleic acid is genomically digested, end labeled, purified, and
analyzed. Such procedures are well-known to those of ordinary skill in the art. The
information from each fragment is sorted into one of two complementary strand groups.
Sorting the information into different groups allows the population analysis to
determine the positions of all the desired nucleotides. The first group of data
provides known positions of all the A's and C's on one strand. The second group of
data provides known positions of all of the A's. The combination of these two data
sets reveals the position of all of the A's and C's on one strand. The same procedure
may be applied to the complementary strand to determine the positions of the A's and
C's on that strand. The resultant data reveals the entire sequence for both strands of
the nucleic acid, based on the assumption that the strand includes the complementary
nucleotide pairs of A and C (A:T and C:G). To cross-verify the sequence, the process
can be repeated for the other pairs of non-complementary nucleotides such as TG, TC
and AG.
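
The bookkeeping behind the A/C example can be sketched as below. It is a minimal
illustration, assuming that label positions on each strand are already known exactly
and are counted 5' to 3' on that strand; the helper names and the 8-mer are
hypothetical and not taken from the specification.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(strand):
    return "".join(COMPLEMENT[ch] for ch in reversed(strand))

def label_positions(strand, bases):
    """Positions (5'->3') that would carry a label if `bases` were labeled."""
    return {i for i, ch in enumerate(strand) if ch in bases}

def call_a_and_c(ac_sample, a_sample):
    """Sample 1 (A and C, same fluorophore) minus sample 2 (A only) gives the C's."""
    calls = {i: "A" for i in a_sample}
    calls.update({i: "C" for i in ac_sample - a_sample})
    return calls

def rebuild_top(calls_top, calls_bottom, length):
    """An A or C on the bottom strand implies a T or G at the paired top position."""
    top = ["?"] * length
    for i, base in calls_top.items():
        top[i] = base
    for j, base in calls_bottom.items():
        top[length - 1 - j] = COMPLEMENT[base]
    return "".join(top)

top = "AGCTTACG"
bottom = reverse_complement(top)
calls_top = call_a_and_c(label_positions(top, "AC"), label_positions(top, "A"))
calls_bottom = call_a_and_c(label_positions(bottom, "AC"), label_positions(bottom, "A"))
print(rebuild_top(calls_top, calls_bottom, len(top)))
```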

Detailed Description Text (86) : 

By including more than one physical characteristic into the label, the simultaneous 
and overlapping reading of the nucleic acid within the same temporal frame may provide 
more accurate and rapid information about the positions of the labeled nucleotides 
than when only a single physical characteristic is included. The sample may be, for 
instance, labeled with different wavelength fluorophores. Each of the fluorophores can
be detected separately to provide distinct readings from the same sample. For 
instance, the end units of a polymer may be labeled with fluorophores which emit at a 
first wavelength and a set of internal units may be labeled with a fluorophore which 
emits at a second wavelength. As the polymer is moved past the signal station both 
wavelengths can be detected to provide information about both sets of labels. 

Detailed Description Text (87) :

One use for the methods of the invention is to determine the sequence of units within 
a polymer. Identifying the sequence of units of a polymer, such as a nucleic acid, is
an important step in understanding the function of the polymer and determining the 
role of the polymer in a physiological environment such as a cell or tissue. The 
sequencing methods currently in use are slow and cumbersome. The methods of the 
invention are much quicker and generate significantly more sequence data in a very 
short period of time. 

Detailed Description Text (90) : 

The interaction station in a preferred embodiment is a region of a molecular motor 
where a localized agent, such as an acceptor fluorophore, attached to the molecular 
motor or support can interact with a polymer passing through the molecular motor. The 
point where the polymer passes the localized region of agent is the interaction 
station. As each labeled unit of the polymer passes by the agent a detectable signal
is generated. The agent may be localized within the region of the channel in a variety 
of ways. For instance the agent may be physically attached to the molecular motor, 
directly or by a linker, at the site where the polymer interacts with the molecular 
motor. Alternatively, the molecular motor may be attached to a support and the agent 
may also be attached to the support, as long as the agent is attached to a region of 
the support by which all units of the polymer will pass. For instance, the agent may 
be embedded in a material or on the surface of a material that forms the wall of a 
channel wherein the molecular motor is attached to the wall and moves the polymer 
through the channel. Alternatively the agent may be a light source which is positioned 
a distance from the molecular motor or support but which is capable of transporting 
light directly to a region of the channel through a waveguide. These and other related 
embodiments of the invention are discussed in more detail below. The movement of the 
polymer may be assisted by the use of a groove or ring to guide the polymer. 

Detailed Description Text (98) :

A variation of these types of interaction involves the presence of a third element of 
the interaction, a proximate compound which is involved in generating the signal. For 
example, a labeled unit may be labeled with a light emissive compound which is a donor 
fluorophore and a proximate compound can be an acceptor fluorophore. If the light
emissive compound is placed in an excited state and brought proximate to the acceptor
fluorophore, then energy transfer will occur between the donor and acceptor, 
generating a signal which can be detected as a measure of the presence of the labeled 
unit which is light emissive. The light emissive compound can be placed in the 
"excited" state by exposing it to light (such as a laser beam) or by exposing it to a 
fluorescence excitation source. 

Detailed Description Text (100) : 

A set of interactions parallel to those described above can be created wherein, 
however, the light emissive compound is the proximate compound and the labeled unit is 
either a quenching source or an acceptor source. In these instances the agent is
electromagnetic radiation emitted by the proximate compound, and the signal is
generated, characteristic of the interaction between the labeled unit and such
radiation, by bringing the labeled unit in interactive proximity with the proximate
compound.

Detailed Description Text (101) :

The mechanisms by which each of these interactions produces a detectable signal are
known in the art. For exemplary purposes, the mechanism by which a donor and an acceptor
fluorophore interact according to the invention to produce a detectable signal,
including practical limitations which are known to result from this type of
interaction and methods of reducing or eliminating such limitations, is set forth
below.

Detailed Description Text (105) :

Analysis of the radiolabeled polymers is identical to other means of generating 
signals. For example, a sample with radiolabeled A's can be analyzed by the system to 
determine relative spacing of A's on a sample DNA. The time between detection of 
radiation signals is characteristic of the polymer analyzed. Analysis of four 
populations of labeled DNA (A's, C's, G's, T's) can yield the sequence of the nucleic 
acid analyzed. The sequence of DNA can also be analyzed with a more complex scheme
including analysis of a combination of dual labeled DNA and singly labeled DNA.
Analysis of an A and C labeled fragment followed by analysis of an A labeled version of
the same fragment yields knowledge of the positions of the A's and C's. The sequence is
known if the procedure is repeated for the complementary strand. The system can 
further be used for analysis of polymer (polypeptide, RNA, carbohydrates, etc.), size, 
concentration, type, identity, presence, sequence and number. 

Detailed Description Text (107) :

In another preferred embodiment the signal generated by the interaction between the
labeled unit and the agent results from fluorescence resonance energy transfer (FRET)
between fluorophores. Either the labeled unit or the proximate compound/agent may be
labeled with either the donor or acceptor fluorophore. FRET is the transfer of
photonic energy between fluorophores. FRET has promise as a tool in characterizing
molecular detail because of its ability to measure distances between two points
separated by 10 Å to 100 Å. The angstrom resolution of FRET has been used in
many studies of molecular dynamics and biophysical phenomena (for reviews see Clegg,
1995; Clegg, 1992; Selvin, 1995; and Wu and Brand, 1994). The resolving power of FRET
arises because energy transfer between donor and acceptor fluorophores is dependent on
the inverse sixth power of the distance between the probes. In practice, this
resolution is about an order of magnitude better than that of the highest resolution
electron microscope.
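
The inverse sixth power dependence can be made concrete with the standard Förster
expression for transfer efficiency, E = 1 / (1 + (r/R0)^6). The sketch below is
illustrative only: the expression is the textbook form rather than one quoted in the
specification, and the Förster radius R0 of 50 Å is an assumed value chosen for the
example.

```python
def fret_efficiency(r_angstrom, r0_angstrom=50.0):
    """Standard Forster efficiency for a donor-acceptor distance r.

    r0_angstrom (distance of 50% transfer) is an assumed, illustrative value.
    """
    return 1.0 / (1.0 + (r_angstrom / r0_angstrom) ** 6)

# Efficiency falls steeply across the 10-100 angstrom range
for r in (10, 25, 50, 75, 100):
    print(r, round(fret_efficiency(r), 4))
```

The steep change around R0 is what makes the signal sensitive to small changes in
donor-acceptor distance.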

Detailed Description Text (108) :

In order for FRET to occur, the emission spectrum of the donor must overlap with the
excitation spectrum of the acceptor. The labeled unit of the polymer is specifically
labeled with an acceptor fluorophore. The agent is a donor fluorophore. A laser is
tuned to the excitation wavelength of the donor fluorophore. As the polymer is moved
through the channel, the donor fluorophore emits its characteristic wavelength. As the
acceptor fluorophore moves into interactive proximity with the donor fluorophore, the
acceptor fluorophore is excited by the energy from the donor fluorophore. The
consequence of this interaction is that the emission of the donor fluorophore is
quenched and that of the acceptor fluorophore is enhanced.

Detailed Description Text (109) : 

In order to generate an optimally efficient FRET signal for detection, two conditions
should be satisfied. The first condition is efficient donor emission in the absence of
acceptors. The second is efficient generation of a change in either donor or acceptor
emissions during FRET. Each of these is described in more detail in co-pending PCT
Patent Application PCT/US98/03024 and U.S. Ser. No. 09/134,411.

Detailed Description Text (121) : 

As used herein a "material shield" is any material which prevents or limits energy 
transfer or quenching. Such materials include but are not limited to conductive 
materials, high index materials, and light impermeable materials. In a preferred 
embodiment the material shield is a conductive material shield. As used herein a 
"conductive material shield" is a material which is at least conductive enough to 
prevent energy transfer between donor and acceptor sources.

Detailed Description Text (122) :

A "conductive material" as used herein is a material which is at least conductive 
enough to prevent energy transfer between a donor and an acceptor.

Detailed Description Text (123) :

A "nonconductive material" as used herein is a material which conducts less than that 
amount that would allow energy transfer between a donor and an acceptor.

Detailed Description Text (142) : 

In enzyme (molecular motor) or template (polymer) immobilization analysis of the
complexes, a simplified reaction vessel is used which consists of a high grade fused
silica slide and coverslip. The enzyme or template is immobilized to the fused silica
surface by different coupling means as discussed below.

Detailed Description Text (145) :

Immobilization of molecular motors is accomplished using methods known in the art for
immobilizing proteins. Immobilization of RNA transcription complexes, for instance,
has been reported by Schafer et al., 1991. Briefly, transcription complexes were placed
between two lines of silicone vacuum grease in the center of a borosilicate glass
coverslip (Clay-Adams, No. 3223) in a humidified chamber and incubated for 15 minutes
at 20°C. To remove unbound complexes, the solution over the coverslip was
partially replaced 10 times by the simultaneous withdrawal of 10 µl solution and
addition of 10 µl PTC buffer, and BSA was then added to a concentration of 1 mg/ml.
To initiate transcription, 1 mM each of ATP, CTP, GTP, UTP was allowed to flow into
the chamber. Similar protocols can be followed for the immobilization of other
biomolecules such as DNA polymerase and helicases. The binding of template DNA to the
solid support may be accomplished by several means including streptavidin-biotin
interaction or amine-succinimidyl ester linkage. These protocols are outlined
beginning with the streptavidin-biotin interaction: 2.67 µl of concentrated (15
pmol/L) are added to 7.33 µl of water. The streptavidin coated surface, obtained
from Xenopore, N.J. is placed on a sponge in a petri dish in water. 10 µl of the
template mixture is added to a 1 cm² area of slide. The petri dish is covered and
incubated for one hour at room temperature. The glass is washed three times with 0.1 M
sodium phosphate pH 7.2 containing 0.15 M NaCl. For the amine-succinimidyl ester
linkage, the protocol follows. The succinimidyl ester derivatized surfaces are
obtained from Corning Life Sciences, Mass. 200 µL of 25 picomolar DNA in phosphate
buffer is added to the surface and incubated overnight at 4°C. The surface is
washed three times with Tween 20 in PBS. TE buffer (10 mM Tris, pH 8, 1 mM EDTA) is
added to block unreacted succinimidyl ester groups, incubating for 30 minutes at
37°C. The plate is washed three times with Tween 20 in PBS.
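
The quantities in the two protocols can be checked with simple arithmetic. The sketch
below uses only the volumes and concentrations stated above; the function names are
hypothetical, and ideal mixing (C1 x V1 = C2 x V2) is assumed.

```python
def diluted_concentration(stock_conc, stock_vol, diluent_vol):
    """C2 = C1 * V1 / (V1 + V_diluent); units follow the inputs."""
    return stock_conc * stock_vol / (stock_vol + diluent_vol)

def amount_mol(conc_mol_per_l, vol_l):
    """Amount of substance in a given volume."""
    return conc_mol_per_l * vol_l

# Streptavidin-biotin protocol: 2.67 ul of 15 pmol/L stock into 7.33 ul of water
final_pm = diluted_concentration(15.0, 2.67, 7.33)    # pmol/L, about 4
# Succinimidyl ester protocol: 200 uL of 25 pM DNA
dna_mol = amount_mol(25e-12, 200e-6)                  # mol, about 5 fmol
print(final_pm, dna_mol)
```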

Detailed Description Text (147) :

Alternatively, the enzymes are fixed onto the surface using standard protein
immobilization techniques (Schafer et al., 1991). Transcription complexes (molecular
motors) are placed between two lines of silicone vacuum grease in the center of a
borosilicate glass coverslip (Clay-Adams, No. 3223) in a humidified chamber and
incubated for 15 minutes at 20°C. To remove unbound complexes, the solution
over the coverslip is partially replaced 10 times by the simultaneous withdrawal of 10
µl solution and addition of 10 µl PTC buffer. BSA is then added to a
concentration of 1 mg/ml. The concentration of the complexes is adjusted so that by
probability, there is less than one complex per grid location. The grid locations in
this technique are determined by a hydrophobic/hydrophilic patterning area. Only the
hydrophilic regions of the chip are derivatized with the enzyme complexes. The
hydrophilic regions are arranged in a densely packed grid pattern with more than a
million grid locations per square centimeter. To create the hydrophilic patterned
areas, a porous/gridded silicone mask (General Electric RTV 615A and 615B) is placed
over the glass area and subjected to a gentle oxygen plasma etch using 25 mTorr O2 at
25 sccm flow rate, 30 W for 50 seconds. The areas exposed to the oxygen etch are
rendered hydrophilic and amenable to site specific immobilization of the
enzyme-fluorophore complexes.
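
One way to reason about "less than one complex per grid location" is to treat the
loading as a Poisson process. That statistical model is an assumption introduced here
for illustration and is not stated in the specification; the mean occupancy of 0.1
complexes per location is likewise a made-up value.

```python
from math import exp, factorial

def poisson(k, mean):
    """Probability of exactly k complexes at a location with the given mean."""
    return mean ** k * exp(-mean) / factorial(k)

def occupancy(mean):
    """Fractions of grid locations that are empty, singly or multiply occupied."""
    empty = poisson(0, mean)
    single = poisson(1, mean)
    return {"empty": empty, "single": single, "multiple": 1.0 - empty - single}

print(occupancy(0.1))
```

Under this model, lowering the mean occupancy trades fewer usable locations for a
smaller fraction of locations holding more than one complex.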

Detailed Description Text (166) :

Single molecule data is generated from the molecular motor complex migrating along the
strand of DNA, shown schematically. The data is streamed in real time to give
information about the precise labeling strategy of the DNA molecule. Real-time
sequence is generated from the pattern of the acceptor emissions. Excitation of the
donor molecule gives rise to proximity excitation of the acceptor. Either the emission
of the donor molecule is measured or the emission of the acceptor. Real-time increases
in signal arise from measurement of the acceptor fluorescence and real-time decreases
in signal arise from measurement of the donor fluorescence, shown in a graph depicting
Intensity vs Time. The raw information from single molecule analysis is directly
correlated to the labeling strategy and the real-time output from the system. The
information from the detectors is streamed to a databoard operating at the appropriate
data rates. Software analysis in LabVIEW or a similar program yields quantitative data
of DNA sequence information.
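
A minimal sketch of the kind of trace analysis described above is given below in
Python rather than LabVIEW. It is not the software of the specification: the threshold
detector, the function names, the sample trace and the sampling period are all
assumptions chosen to illustrate how increases in acceptor intensity could be turned
into times between labeled units.

```python
def detect_events(intensity, threshold):
    """Sample indices where the acceptor trace rises from below to at/above threshold."""
    events = []
    above = False
    for i, value in enumerate(intensity):
        if value >= threshold and not above:
            events.append(i)
        above = value >= threshold
    return events

def intervals(events, sample_period_s):
    """Time between successive events, in seconds."""
    return [(b - a) * sample_period_s for a, b in zip(events, events[1:])]

# Hypothetical acceptor intensity samples (arbitrary units)
trace = [0, 0, 5, 6, 0, 0, 0, 7, 0, 8, 8, 0]
events = detect_events(trace, threshold=4)
print(events, intervals(events, sample_period_s=0.5))
```

For a donor-channel measurement the same logic would apply to decreases in signal,
for example by thresholding the negated trace.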

CLAIMS : 

11. The method of claim 1, wherein the polymer is a nucleic acid and the molecular 
motor is a nucleic acid molecular motor.

12. The method of claim 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10, wherein the molecular motor 
is a nucleic acid molecular motor that is a polymerase.

20. A nucleic acid molecular motor, wherein the nucleic acid molecular motor includes
an agent selected from the group consisting of an electromagnetic radiation source, a 
quenching source, and a fluorescence excitation source positioned in interactive 
proximity with a signal station of the nucleic acid molecular motor. 

21. A solution comprising the nucleic acid molecular motor of claim 20. 

22. The solution of claim 21, wherein the nucleic acid molecular motor is a 
polymerase.

34. The method of claim 23, wherein the molecular motor is a nucleic acid molecular 
motor. 

36. The method of claim 34, wherein the nucleic acid molecular motor is a polymerase.